Satrap: Data and Network Heterogeneity Aware P2P Data-Mining

Authors

  • Hock Hee Ang
  • Vivekanand Gopalkrishnan
  • Anwitaman Datta
  • Wee Keong Ng
  • Steven C. H. Hoi
Abstract

Distributed classification aims to build an accurate classifier by learning from distributed data while reducing computation and communication cost. A P2P network, where numerous users come together to share resources such as data content, bandwidth, storage space and CPU resources, is an excellent platform for distributed classification. However, two important aspects of the learning environment have often been overlooked by other works, viz., 1) the location of the peers, which results in variable communication cost, and 2) the heterogeneity of the peers’ data, which can help reduce redundant communication. In this paper, we examine the properties of network and data heterogeneity and propose a simple yet efficient P2P classification approach that minimizes expensive inter-region communication while achieving good generalization performance. Experimental results demonstrate the feasibility and effectiveness of the proposed solution.

Keywords: Distributed classification, P2P network, cascade SVM.
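The abstract's motivation for minimizing inter-region communication can be illustrated with a toy message count. The sketch below is not the Satrap algorithm; it only compares a naive all-to-all model exchange with a scheme that routes models through one representative per region. The region names, peer ids, function names, and the single-representative scheme are all hypothetical assumptions made for illustration.

```python
from itertools import permutations


def flat_exchange(regions):
    """Every peer sends its local model to every other peer.

    Returns (intra_region_messages, inter_region_messages).
    """
    peers = [(region, peer) for region, members in regions.items() for peer in members]
    intra = inter = 0
    for (src_region, _), (dst_region, _) in permutations(peers, 2):
        if src_region == dst_region:
            intra += 1
        else:
            inter += 1
    return intra, inter


def regional_exchange(regions):
    """Hypothetical scheme: one representative per region relays models.

    Each non-representative peer uploads once to its representative and
    later receives one merged result back (intra-region traffic). Each
    representative sends one merged model to every other representative
    (inter-region traffic).
    """
    intra = sum(2 * (len(members) - 1) for members in regions.values())
    k = len(regions)
    inter = k * (k - 1)
    return intra, inter


# Assumed toy topology: three regions with a handful of peers each.
regions = {
    "asia": ["a1", "a2", "a3"],
    "europe": ["e1", "e2"],
    "america": ["m1", "m2", "m3"],
}

flat_intra, flat_inter = flat_exchange(regions)
reg_intra, reg_inter = regional_exchange(regions)
print("flat:     intra =", flat_intra, "inter =", flat_inter)
print("regional: intra =", reg_intra, "inter =", reg_inter)
```

In this toy topology, the all-to-all baseline sends most of its messages across regions, while the representative scheme keeps the inter-region count proportional to the square of the number of regions rather than the number of peers. Real costs would also depend on model size and link latency, which this count ignores.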

Similar resources

P2P Network Trust Management Survey

Peer-to-peer applications (P2P) are no longer limited to home users, and start being accepted in academic and corporate environments. While file sharing and instant messaging applications are the most traditional examples, they are no longer the only ones benefiting from the potential advantages of P2P networks. For example, network file storage, data transmission, distributed computing, and co...


Context-aware Modeling for Spatio-temporal Data Transmitted from a Wireless Body Sensor Network

Context-aware systems must be interoperable and work across different platforms at any time and in any place. Context data collected from wireless body area networks (WBAN) may be heterogeneous and imperfect, which makes their design and implementation difficult. In this research, we introduce a model which takes the dynamic nature of a context-aware system into consideration. This model is con...


Peer-to-Peer Data Mining, Privacy Issues, and Games

Peer-to-Peer (P2P) networks are gaining increasing popularity in many distributed applications such as file-sharing, network storage, web caching, searching and indexing of relevant documents and P2P network-threat analysis. Many of these applications require scalable analysis of data over a P2P network. This paper starts by offering a brief overview of distributed data mining applications and ...


Distributed Frequent Item Sets Mining over P2P Networks

Data intensive peer-to-peer (P2P) networks are becoming increasingly popular in applications like social networking, file sharing networks, etc. Data mining in such P2P environments is the new generation of advanced P2P applications. Unfortunately, most of the existing data mining algorithms do not fit well in such environments since they require data that can be accessed in its entirety. It al...


Scaling Unstructured Peer-to-Peer Networks with Heterogeneity-Aware Topology And Routing

Peer-to-peer (P2P) file sharing systems such as Gnutella have been widely acknowledged as the fastest growing Internet applications ever. The P2P model has many potential advantages including high flexibility and server-less management. However, these systems suffer from the well-known performance mismatch between the randomly constructed overlay network topology and the underlying IP-layer top...




Publication date: 2010